A Portable 65xx Family Assembler Development System
As65, Lk65 and Lb65 form a portable development toolset for 65xx, 65C02 and 65816 assembly language programming. All of the programs are coded in Java (using JDK 1.4.2) and should execute on any compatible JDK or JRE environment that has the JAXP (XML processing) package and a SAX parser (included in the standard SUN Java distribution).
To run the commands in a DOS window you need some simple batch scripts that invoke java and tell it which class to execute. Each script will look similar to this:
@java -cp 65xx.zip uk.co.demon.obelisk.w65xx.As65 %*
If you change the position of the batch script relative to the 65xx.zip file then the command will fail unless you update the script or copy the zip file.
I haven't provided scripts for a UNIX environment, but they would be very similar, something like:
#! /bin/sh
java -cp 65xx.zip uk.co.demon.obelisk.w65xx.As65 "$@"
The assembler (As65) produces relocatable object modules by assembling lines of source code held in local files. The format of each source line must follow the pattern shown below. Square brackets enclose optional components within the line (like the label) whilst the '(X|Y)' pattern indicates a choice between alternatives delimited by '|' characters.
[[label[:]] [(opcode|directive|macro) [arguments]]] [; comment]
Opcode and directive names are case insensitive in source code but labels and macro names are case sensitive. Given this syntax all of the following examples are valid.
; A comment line
a_label_by_itself
        NOP             ; Opcode with no argument followed by a comment
        nop             ; Same as above
        LDA #1
        .6502           ; Generate code for 6502 processor
        MYMACRO 1,2,3   ; Generate a parameterised macro
Labels can be placed before all opcodes or on lines by themselves. A global label consists of a letter or underscore ('_') followed by a series of alphanumeric and/or underscore characters. A label may optionally be followed by a colon (':').
A local label has the same grammatical construction as a global label but begins with a period ('.'). Whilst a global label may only be defined once within a module, a local label may be defined several times provided it appears each time within the scope of a different global label.
SomeGlobalLabel:
.ALocalLabel:
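The label rules above can be written down as a pair of patterns. The following is only an illustrative Python sketch, not code taken from As65: the function name is hypothetical, and it assumes that "letter" means an ASCII letter and that a one-character label is allowed.

```python
import re

# A letter or underscore, then any number of letters, digits or underscores.
# Treating "letter" as ASCII-only is an assumption made for this sketch.
GLOBAL_LABEL = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# A local label has the same shape but starts with a period.
LOCAL_LABEL = re.compile(r"\.[A-Za-z_][A-Za-z0-9_]*")


def classify_label(text):
    """Return 'global', 'local' or None for a label with an optional colon."""
    name = text[:-1] if text.endswith(":") else text
    if GLOBAL_LABEL.fullmatch(name):
        return "global"
    if LOCAL_LABEL.fullmatch(name):
        return "local"
    return None
```

For instance, classify_label("SomeGlobalLabel:") gives 'global' and classify_label(".ALocalLabel:") gives 'local'.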
Most directives do not allow labels. Those that do give them special meaning (e.g. macro name, symbol name in .EQU and .SET, etc.).
The arguments provided to most opcodes and directives are expressions composed of absolute values (e.g. constant literals), relative values (e.g. the address of some relocatable instruction or piece of data) and external values (e.g. values defined in other source modules).
The expression parser evaluates operations on absolute values during processing to produce constant values but expressions involving relative and external terms are left for the linker to resolve. The following table shows all the supported operators in decreasing order of precedence.
Operator | Description |
$, ( sub-expression ), number, symbol, 'character literal' | Unary values |
+, -, ~, !, LO, HI, BANK | Unary plus (ignored), Negation, Complement, Logical Not, Bits 7 to 0, Bits 15 to 8, Bits 31 to 16 |
*, /, % | Multiply, Divide, Remainder |
+, - | Addition, Subtraction |
<<, >> | Left Shift, Right Shift |
<, <=, >, >= | Less Than, Less Than Or Equal, Greater Than, Greater Than Or Equal |
==, != | Equal, Not Equal |
&, |, ^ | Binary AND, Binary OR, Binary XOR |
&&, || | Logical AND, Logical OR |
Expressions may only contain numeric values. There are no string functions.
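The bit ranges given for LO, HI and BANK in the table translate directly into shift-and-mask arithmetic. Here is a small Python sketch of that arithmetic (illustration only; the function names are mine, and the only facts used are the bit ranges from the table and the 32-bit evaluation width).

```python
MASK32 = 0xFFFFFFFF  # expressions are evaluated at 32-bit precision


def lo(value):
    """Bits 7 to 0 of the value."""
    return (value & MASK32) & 0xFF


def hi(value):
    """Bits 15 to 8 of the value."""
    return ((value & MASK32) >> 8) & 0xFF


def bank(value):
    """Bits 31 to 16 of the value, as listed in the operator table."""
    return ((value & MASK32) >> 16) & 0xFFFF
```

So for the value $12345678, lo gives $78, hi gives $56 and bank gives $1234.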
Literal numeric values can be expressed in binary, octal, decimal or hexadecimal, or as character values. Literal values may be up to 32 bits in size and all expressions are evaluated at this precision. Values are masked to 8 or 16 bits when generating code.
        LDA #%10101100  ; Load a binary constant
        LDX #@177       ; Load an octal constant
        LDA 127         ; Load from a location specified in decimal
        STA $FFC1       ; Store at a location specified in hexadecimal
        lda #'X'        ; Load ASCII for 'X' into the accumulator
        .LONG 'ABCD'    ; A 32-bit character constant
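To make the literal prefixes concrete, here is a rough Python sketch that converts the single-value literals used in the examples above and then masks the result for code generation. It is not the assembler's own parser; the function names are hypothetical, and multi-character constants such as 'ABCD' are deliberately left out because the byte order used for them is not stated here.

```python
def parse_literal(text):
    """Convert one literal (%binary, @octal, $hex, decimal or 'c') to an int."""
    if text.startswith("%"):
        value = int(text[1:], 2)
    elif text.startswith("@"):
        value = int(text[1:], 8)
    elif text.startswith("$"):
        value = int(text[1:], 16)
    elif len(text) == 3 and text[0] == "'" and text[2] == "'":
        value = ord(text[1])          # single character constant
    else:
        value = int(text, 10)
    return value & 0xFFFFFFFF         # literals are held at 32-bit precision


def mask_for_code(value, bits):
    """Mask a 32-bit value down to the 8 or 16 bits actually emitted."""
    return value & ((1 << bits) - 1)
```

For example, parse_literal("@177") and parse_literal("127") produce the same value, and mask_for_code(0x12345, 16) keeps only $2345.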
The assembler currently takes a minimalist approach to directives and opcodes. It does not provide any synonyms for the instructions.
This directive places the assembler in 6501 processor mode. The 6501 processor supports all normal 6502 instructions as well as the extended BBR, BBS, SMB and RMB instructions.
This directive places the assembler in 6502 processor mode. Only the traditional 6502 instructions and addressing modes are supported.
This directive places the assembler in 65C02 processor mode. The 65C02 processor supports all normal 6502 instructions plus new addressing modes and some extra instructions including the BBR, BBS, SMB and RMB instructions.
This directive places the assembler in 65SC02 processor mode. The 65SC02 processor supports the same instructions as the 65C02 but does not have the extended BBR, BBS, SMB and RMB instructions.
This directive places the assembler into 65816 processor mode.
The .CODE directive tells the assembler to place any code generated by instructions or data directives into the object file's code section.
The .DATA directive tells the assembler to place any code generated by instructions or data directives into the object file's initialised data section.
The .BSS directive tells the assembler to place any code generated by instructions or data directives into the object file's uninitialised data section.
The .PAGE0 directive tells the assembler to place any code generated by instructions or data directives into a specially marked section that will be located on page 0 ($0000-$00FF on 8-bit CPUs or $000000-$00FFFF on 16-bit CPUs).
The .ORG directive sets the absolute target address for the current section.
The .DPAGE directive informs the assembler of the assumed value of the direct page register for the following sequence of instructions so that direct-page addressing can be used instead of absolute where possible.
The .DBREG directive informs the assembler of the assumed value of the data bank register for the following sequence of instructions so that absolute addressing can be used instead of long absolute where possible.
When assembling for the 65816 processor the .LONGA directive controls the size of immediate values loaded into the accumulator. If a .LONGA ON directive has been processed then 16-bit literals will be generated, otherwise they will be 8 bits.
When assembling for the 65816 processor the .LONGI directive controls the size of immediate values loaded into the X and Y registers. If a .LONGI ON directive has been processed then 16-bit literals will be generated, otherwise they will be 8 bits.
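The effect of the two size settings on the generated bytes can be sketched in a few lines of Python. This is an illustration rather than As65's code generator: the function name and the longa/longi flags are hypothetical stand-ins for the assembler's internal state. The opcode values ($A9, $A2, $A0) are the standard immediate-mode encodings of LDA, LDX and LDY, and the operand is written low byte first as the 65xx family expects.

```python
# Standard immediate-mode opcodes for the three load instructions.
IMMEDIATE_OPCODES = {"LDA": 0xA9, "LDX": 0xA2, "LDY": 0xA0}


def encode_immediate(mnemonic, value, longa=False, longi=False):
    """Encode a load-immediate with an 8-bit or 16-bit operand.

    longa models the accumulator size setting, longi the index register one.
    """
    wide = longa if mnemonic == "LDA" else longi
    opcode = IMMEDIATE_OPCODES[mnemonic]
    if wide:
        # 16-bit operand, low byte first
        return bytes([opcode, value & 0xFF, (value >> 8) & 0xFF])
    return bytes([opcode, value & 0xFF])
```

With the accumulator set to 16 bits, LDA #$1234 becomes the three bytes A9 34 12, while LDX #$1234 is still truncated to two bytes unless the index setting is also on.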
The .IF directive assembles the following source code up to the matching .ELSE or .ENDIF if the constant expression evaluates to a non-zero value.
        JSR DoSomething
        .IF DEBUGGING
        JSR DumpRegisters
        .ENDIF
        JSR DoTheNextBit
The .IFABS directive assembles the following source code up to the matching .ELSE or .ENDIF if the expression evaluates to an absolute (i.e. constant) value.
This directive is useful in macros to test the type of the parameter value.
The .IFNABS directive assembles the following source code up to the matching .ELSE or .ENDIF if the expression does not evaluate to an absolute (i.e. constant) value.
This directive is useful in macros to test the type of the parameter value.
The .IFREL directive assembles the following source code up to the matching .ELSE or .ENDIF if the expression evaluates to a relocatable value.
This directive is useful in macros to test the type of the parameter value.
The .IFNREL directive assembles the following source code up to the matching .ELSE or .ENDIF if the expression does not evaluate to a relocatable value.
This directive is useful in macros to test the type of the parameter value.
The .ELSE directive assembles the following source code up to the matching .ENDIF if the condition for the preceding matching .IF, .IFABS, .IFNABS, .IFREL or .IFNREL directive was not met.
The .ENDIF directive marks the end of a conditional code section.
Causes the contents of the indicated file to be read and processed before the remainder of the current file.
The .APPEND directive closes the current source file and continues processing at the first line of the indicated file.
        NOP
        .APPEND "AnotherFile.asm"
        NOP             ; This line will not be processed.
The .END directive marks the end of the source code.
        NOP
        .END
        NOP             ; This line will not be processed.
The .INSERT directive reads the binary contents of the indicated file and inserts it directly into the generated object code.
A typical use is to insert pre-compiled data such as graphics images, encryption keys or lookup tables into the code.
The .REPEAT directive causes the source lines up to the matching .ENDR directive to be repeated the number of times indicated by the constant expression.
        .REPEAT 8       ; Generate 8 NOPs
        NOP
        .ENDR
The .ENDR directive marks the end of a .REPEAT section.
The .MACRO directive indicates that the following source lines up to the matching .ENDM should be used to define a macro. The name of the macro is taken from the label preceding the .MACRO command.
_NOT16  .MACRO VLA,RES
        LDA VLA+0
        EOR #$FF
        STA RES+0
        LDA VLA+1
        EOR #$FF
        STA RES+1
        .ENDM
Macro arguments can be accessed by defining symbolic names for them or by positional references (using \0 through \9). The sequence \? can be used within a macro to obtain the macro expansion count, for example to generate unique labels for branches within the macro.
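One way to picture macro expansion with named arguments is as a textual substitution pass over the macro body. The Python sketch below is only a model of that idea and says nothing about how As65 really implements it. The function name is hypothetical, positional references (\0 to \9) are not handled, and the assumption that \? is replaced by a plain decimal number is mine.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def expand_macro(params, body, args, expansion_count):
    """Substitute named parameters and the \\? sequence in each body line."""
    mapping = dict(zip(params, args))

    def substitute(line):
        # Replace the expansion-count marker first (decimal form is assumed).
        line = line.replace("\\?", str(expansion_count))
        # Then swap whole identifiers that match a parameter name.
        return IDENTIFIER.sub(
            lambda m: mapping.get(m.group(0), m.group(0)), line)

    return [substitute(line) for line in body]
```

Expanding the first two stores of a _NOT16-style body with the (made-up) arguments COUNT and TEMP turns "LDA VLA+0" into "LDA COUNT+0" and "STA RES+0" into "STA TEMP+0", while a line such as "BNE .Done\?" receives a different numeric suffix on every expansion.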
The .ENDM directive marks the end of a .MACRO definition.
When used within a macro it causes an immediate termination of the expansion process.
The .GLOBAL directive lists one or more symbols defined in the current module that can be referenced by code in other modules.
The .EXTERN directive lists one or more symbols defined in other modules so that they can be used in expressions within the current module (e.g. subroutine addresses, key data areas, etc.).
The .BYTE directive deposits a series of 8-bit values into the object code for the current module. The values can be defined as the result of an expression (this includes simple numeric values) or as strings delimited by quotes.
.BYTE "Hello World",$0D,$0A,0
The .WORD directive deposits a series of 16-bit values defined by a series of expressions into the object code for the current module.
.WORD 1,$2,3+5
The .ADDR directive deposits a series of 24-bit values defined by a series of expressions into the object code for the current module.
.ADDR Function1,Function2
The .ADDR directive is primarily intended for creating function jump tables for the 65816 processor.
The .LONG directive deposits a series of 32-bit values defined by a series of expressions into the object code for the current module.
.LONG 1,$2,3+5
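The data directives differ only in how many bytes each value occupies. The Python sketch below shows the byte sequences one would expect, assuming the usual 65xx low-byte-first ordering (an assumption on my part, as is masking oversized values rather than rejecting them). The helper names are hypothetical.

```python
def deposit(values, size_in_bytes):
    """Emit each value as size_in_bytes bytes, low byte first (assumed)."""
    mask = (1 << (8 * size_in_bytes)) - 1
    out = bytearray()
    for value in values:
        out += (value & mask).to_bytes(size_in_bytes, "little")
    return bytes(out)


def byte_directive(*items):
    """Model .BYTE: strings contribute one byte per character."""
    out = bytearray()
    for item in items:
        if isinstance(item, str):
            out += item.encode("ascii")
        else:
            out.append(item & 0xFF)
    return bytes(out)
```

Under those assumptions deposit([1, 0x2, 3 + 5], 2) models the 16-bit example and yields 01 00 02 00 08 00, and the same values with a size of 4 model the 32-bit form.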
The .SPACE directive reserves the specified number of zero valued bytes in the object code.
PTRA .SPACE 2
The .LIST directive enables the output of lines to the listing file.
The .NOLIST directive suspends the generation of a listing.
The .TITLE directive sets the string shown as the title at the top of the listing page.
The .PAGE directive forces the listing to restart at the top of the next page.
The assembler recognizes all the opcodes for the 6501, 6502, 65C02, 65SC02 and 65816 processors but will only generate code for the currently selected processor type. Using an inappropriate opcode will generate an error.
Opcode | 6501 | 6502 | 65C02 | 65SC02 | 65816 |
ADC | Y | Y | Y | Y | Y |
AND | Y | Y | Y | Y | Y |
ASL | Y | Y | Y | Y | Y |
BBR0 BBR1 BBR2 BBR3 BBR4 BBR5 BBR6 BBR7 |
Y | Y | |||
BBS0 BBS1 BBS2 BBS3 BBS4 BBS5 BBS6 BBS7 |
Y | Y | |||
BCC | Y | Y | Y | Y | Y |
BCS | Y | Y | Y | Y | Y |
BEQ | Y | Y | Y | Y | Y |
BIT | Y | Y | Y | Y | Y |
BMI | Y | Y | Y | Y | Y |
BNE | Y | Y | Y | Y | Y |
BPL | Y | Y | Y | Y | Y |
BRA | Y | Y | Y | ||
BRK | Y | Y | Y | Y | Y |
BRL | Y | ||||
BVC | Y | Y | Y | Y | Y |
BVS | Y | Y | Y | Y | Y |
CLC | Y | Y | Y | Y | Y |
CLD | Y | Y | Y | Y | Y |
CLI | Y | Y | Y | Y | Y |
CLV | Y | Y | Y | Y | Y |
CMP | Y | Y | Y | Y | Y |
COP | Y | ||||
CPX | Y | Y | Y | Y | Y |
CPY | Y | Y | Y | Y | Y |
DEC | Y | Y | Y | Y | Y |
DEX | Y | Y | Y | Y | Y |
DEY | Y | Y | Y | Y | Y |
EOR | Y | Y | Y | Y | Y |
INC | Y | Y | Y | Y | Y |
INX | Y | Y | Y | Y | Y |
INY | Y | Y | Y | Y | Y |
JML | Y | ||||
JSL | Y | ||||
LDA | Y | Y | Y | Y | Y |
LDX | Y | Y | Y | Y | Y |
LDY | Y | Y | Y | Y | Y |
LSR | Y | Y | Y | Y | Y |
MVN | Y | ||||
MVP | Y | ||||
NOP | Y | Y | Y | Y | Y |
ORA | Y | Y | Y | Y | Y |
PEA | Y | ||||
PEI | Y | ||||
PER | Y | ||||
PHA | Y | Y | Y | Y | Y |
PHB | Y | ||||
PHD | Y | ||||
PHK | Y | ||||
PHX | Y | Y | Y | ||
PHY | Y | Y | Y | ||
PLA | Y | Y | Y | Y | Y |
PLB | Y | ||||
PLD | Y | ||||
PLP | Y | Y | Y | Y | Y |
PLX | Y | Y | Y | ||
PLY | Y | Y | Y | ||
REP | |||||
RMB0 RMB1 RMB2 RMB3 RMB4 RMB5 RMB6 RMB7 |
Y | Y | |||
ROL | Y | Y | Y | Y | Y |
ROR | Y | Y | Y | Y | Y |
RTI | Y | Y | Y | Y | Y |
RTL | Y | ||||
RTS | Y | Y | Y | Y | Y |
SBC | Y | Y | Y | Y | Y |
SEC | Y | Y | Y | Y | Y |
SED | Y | Y | Y | Y | Y |
SEI | Y | Y | Y | Y | Y |
SEP | Y | ||||
SMB0 SMB1 SMB2 SMB3 SMB4 SMB5 SMB6 SMB7 |
Y | Y | |||
STA | Y | Y | Y | Y | Y |
STP | Y | Y | Y | ||
STX | Y | Y | Y | Y | Y |
STY | Y | Y | Y | Y | Y |
STZ | Y | Y | Y | ||
TAX | Y | Y | Y | Y | Y |
TAY | Y | Y | Y | Y | Y |
TCD | Y | ||||
TCS | Y | ||||
TDC | Y | ||||
TRB | Y | Y | Y | ||
TSB | Y | Y | Y | ||
TSX | Y | Y | Y | Y | Y |
TXA | Y | Y | Y | Y | Y |
TXS | Y | Y | Y | Y | Y |
TXY | Y | ||||
TYA | Y | Y | Y | Y | Y |
TYX | Y | ||||
WAI | Y | Y | Y | ||
WDM | Y | ||||
XBA | Y | ||||
XCE | Y |
The 65xx family of processors supports a number of different addressing modes which can be used with each instruction.
Syntax | Description | Example |
(none) | Implied | NOP |
A | Accumulator | LSR A |
#expr | Immediate | LDA #'A' |
#<expr | Immediate (lo byte) | LDX #<ADDR |
#>expr | Immediate (hi byte) | LDY #>ADDR |
#^expr | Immediate (bank byte) | LDA #^ADDR |
<expr | Direct | STA <PTR |
<expr,X | Direct Indexed by X | LDA <TBL,X |
<expr,Y | Direct Indexed by Y | LDA <TBL,Y |
>expr | Absolute Long (65816 only) | |
>expr,X | Absolute Long Indexed by X (65816 only) | |
[expr] | Long Indirect (65816 only) | |
[expr],Y | Long Indirect Indexed (65816 only) | |
(expr,X) | Indexed Indirect | |
(expr),Y | Indirect Indexed | |
(expr,S),Y | Stack Relative Indirect Indexed (65816 only) | |
(expr) | Indirect | |
|expr | Absolute | |
|expr,X | Absolute Indexed by X | |
|expr,Y | Absolute Indexed by Y | |
expr | Absolute or Direct | |
expr,X | Absolute or Direct Indexed by X | |
expr,Y | Absolute or Direct Indexed by Y | |
expr,S | Stack Relative (65816 only) | |
If the absolute address of the target memory location is known the assembler will attempt to generate the smallest instruction (e.g. direct page instead of absolute). The explicit direct (<expr) and absolute (|expr or !expr) prefixes allow the programmer to specify an exact addressing mode for expressions which are not absolute, for example those referencing external symbols.
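The "smallest instruction" rule can be pictured as a simple decision on the known address. The Python sketch below is my own reading of how the assumed direct page and data bank values (as set by .DPAGE and .DBREG) might feed into that choice. It is not taken from the assembler's source, and the function name, parameter names and exact range tests are all assumptions.

```python
def choose_addressing_mode(address, dpage=0x0000, dbreg=0x00):
    """Pick the shortest operand form for a known 24-bit absolute address.

    dpage and dbreg are the assumed direct page and data bank register
    values. Returns (mode name, operand size in bytes).
    """
    bank = (address >> 16) & 0xFF
    offset = address - dpage
    # The direct page lives in bank 0 and spans 256 bytes from dpage.
    if bank == 0 and 0 <= offset <= 0xFF:
        return ("direct", 1)
    # A 16-bit absolute operand reaches anything in the assumed data bank.
    if bank == dbreg:
        return ("absolute", 2)
    return ("long", 3)
```

With the defaults, $0012 is reached with a direct operand and $1234 needs an absolute one; moving the assumed direct page to $1200 lets $1234 use the one-byte form.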
The assembler can generate code into four different sections (CODE, DATA, BSS and PAGE0). At the start of each pass the sections are defined as relative. Using the .ORG directive any section can be forced to place code or data at a specific absolute memory address.
        .CODE
        NOP             ; A relocatable NOP
        .ORG $F000      ; Make the section absolute
        NOP             ; Place a NOP at $F000
        .DATA           ; Switch to the (relative) DATA section
        .BYTE 1,2,3
        .CODE           ; Switch back to the absolute code section
        NOP             ; Place a NOP at $F001
You can switch between the sections throughout your code. Any code or data generated will be added where the section was left when it was previously used.
Once a section has been made absolute it cannot be made relative again. The .ORG directive can be used multiple times within the same section, for example to reserve memory in different RAM areas.
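The behaviour of the sections can be modelled as one location counter per section, each remembering where it was left. The Python sketch below replays the example above with such counters. The Section class and run_example function are hypothetical names used for illustration and do not mirror the assembler's internals.

```python
class Section:
    """A section with its own location counter; relative until .ORG is used."""

    def __init__(self, name):
        self.name = name
        self.absolute = False
        self.counter = 0      # offset while relative, address once absolute

    def org(self, address):
        """Model .ORG: the section becomes absolute and stays that way."""
        self.absolute = True
        self.counter = address

    def emit(self, size):
        """Reserve size bytes and report where they were placed."""
        kind = "absolute" if self.absolute else "relative"
        where = (kind, self.counter)
        self.counter += size
        return where


def run_example():
    """Replay the section example and record where each item lands."""
    code, data = Section("CODE"), Section("DATA")
    placed = [("NOP", code.emit(1))]                  # relocatable NOP
    code.org(0xF000)
    placed.append(("NOP", code.emit(1)))              # NOP at $F000
    placed.append((".BYTE 1,2,3", data.emit(3)))      # relative DATA bytes
    placed.append(("NOP", code.emit(1)))              # NOP at $F001
    return placed
```

The last NOP lands at $F001 because the CODE counter carried on from where it stopped, even though the DATA section was used in between.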
The assembler supports a simple form of structured programming (IF..ELSE..ENDIF, REPEAT..UNTIL, etc.) based on the flag bits in the condition register. The assembler will generate the branches needed to implement these control structures without you having to define any labels. It also tries to generate the smallest amount of code using relative branches (e.g. BRA, BEQ, BPL, etc.) when it can, only resorting to JMP when it has to.
Structured code may not be as efficient as normal hand coded routines (due to the extra branches) but this is often outweighed by the enhanced readability and reduction in labels.
An IF command starts a block of code that will only be executed if the indicated condition (e.g. EQ, NE, CC, CS, PL, MI, VC or VS) exists. For example a simple 16-bit increment can be coded as follows:
        INC VAL+0
        IF EQ
         INC VAL+1
        ENDIF
The ELSE command can be used to define an alternate block of code to be executed if the condition was not true.
        AND #$01
        IF EQ
         ; A contained an even number
        ELSE
         ; A contained an odd number
        ENDIF
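To show what "generating the branches for you" amounts to, here is a Python sketch that rewrites IF, ELSE and ENDIF lines into ordinary branches and labels. It is a simplified model, not the assembler's algorithm: the label names (L1_FALSE, L1_END) are invented, and it always uses JMP to skip the ELSE block whereas the real tool prefers a relative branch when one will reach.

```python
# An IF block is skipped by branching on the opposite condition.
INVERSE_BRANCH = {
    "EQ": "BNE", "NE": "BEQ", "CC": "BCS", "CS": "BCC",
    "PL": "BMI", "MI": "BPL", "VC": "BVS", "VS": "BVC",
}


def lower_if_blocks(lines):
    """Rewrite IF/ELSE/ENDIF into branches and generated labels."""
    output, open_blocks, count = [], [], 0
    for line in lines:
        words = line.split()
        keyword = words[0].upper() if words else ""
        if keyword == "IF":
            count += 1
            open_blocks.append({"id": count, "has_else": False})
            branch = INVERSE_BRANCH[words[1].upper()]
            output.append(f"        {branch} L{count}_FALSE")
        elif keyword == "ELSE":
            block = open_blocks[-1]
            block["has_else"] = True
            output.append(f"        JMP L{block['id']}_END")
            output.append(f"L{block['id']}_FALSE:")
        elif keyword == "ENDIF":
            block = open_blocks.pop()
            suffix = "END" if block["has_else"] else "FALSE"
            output.append(f"L{block['id']}_{suffix}:")
        else:
            output.append(line)   # ordinary source line, passed through
    return output
```

Applied to the 16-bit increment, the IF EQ line becomes a BNE over the second INC, which is exactly what one would write by hand with a label.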
The REPEAT and UNTIL commands can be used to define a piece of code that repeats (at least once) until some condition is true. For example the following code counts the set bits in A by shifting it left until the result of the shift is zero.
        LDX #0
        REPEAT
         ASL A
         PHP
         IF CS
          INX
         ENDIF
         PLP
        UNTIL EQ
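It can help to trace the loop above on a concrete value. The following Python model mimics an 8-bit accumulator: each pass shifts left, counts the bit that fell into the carry, and stops once the shifted result is zero. The function name is hypothetical and the model assumes 8-bit accumulator mode.

```python
def count_bits(a):
    """Model of the REPEAT..UNTIL EQ loop for an 8-bit accumulator."""
    x = 0                         # LDX #0
    while True:                   # REPEAT
        carry = (a >> 7) & 1      # ASL A moves bit 7 into the carry
        a = (a << 1) & 0xFF
        if carry:                 # IF CS
            x += 1                #  INX
        if a == 0:                # UNTIL EQ tests the zero flag from ASL,
            break                 # which PHP/PLP preserved across the INX
    return x
```

Tracing %10101100 through the model takes six passes and leaves 4 in X. A value of zero still runs the body once, as a REPEAT loop always does.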
If you want a loop that repeats endlessly then use the FOREVER keyword at the end instead of UNTIL.
The WHILE and ENDW commands produce a block of code that will repeat while some condition is true.
        WHILE EQ
        ENDW
Both the REPEAT and WHILE loops can contain the loop modifiers BREAK and CONTINUE.
The BREAK command generates a branch to the next instruction immediately after the matching UNTIL, FOREVER or ENDW.
        LDX #0
        REPEAT
         CPX #7
         IF EQ
          BREAK
         ENDIF
         INX
        FOREVER
Similarly the CONTINUE command generates a branch back to the start of the REPEAT or WHILE loop to force the start of the next iteration.
The linker (Lk65) combines a selection of individually assembled object files and libraries into a complete application and writes it out as a single HEX, S19 or binary file. The syntax of the link command is:
Lk65 [-65C816][-code/data/bss regions][-bin|hex|s19][-output <file>] object ... library ...
The linker will complain if an output format has not been specified.
The -65C816 option informs the linker that it is building for this processor. The code, data and bss options tell the linker where these areas will be in memory and consist of a series of memory address ranges. For example, my home brew has a 4K ROM from $F000 to $FFFF but one page ($FE00-$FEFF) is reserved for I/O hardware mapping, so I would tell the linker to place code as follows:
Lk65 -code $F000-$FDFF,$FF00-$FFFF rtos.obj
You need only define the areas for the section types you have used. If the linker finds a section type that you have not specified it will produce an error (and crash at the moment) and you can add a definition to the command line and rerun the link.
At the moment the code, data and bss areas should not overlap! Additionally the bss page should be somewhere other than page zero.
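Before running the linker it can be handy to sanity-check a set of region definitions. The Python sketch below parses a region list written in the style of the -code example and checks two lists for overlap. It is a stand-alone helper of my own, not part of Lk65, and it assumes that both ends of a range are inclusive hexadecimal addresses with an optional '$' prefix.

```python
def parse_regions(spec):
    """Parse 'START-END,START-END' into a list of inclusive (start, end)."""
    regions = []
    for part in spec.split(","):
        start_text, end_text = part.split("-")
        start = int(start_text.lstrip("$"), 16)
        end = int(end_text.lstrip("$"), 16)
        if end < start:
            raise ValueError(f"region ends before it starts: {part}")
        regions.append((start, end))
    return regions


def total_size(regions):
    """Number of bytes covered, treating both ends as inclusive."""
    return sum(end - start + 1 for start, end in regions)


def regions_overlap(first, second):
    """True if any range in first shares an address with one in second."""
    return any(a_start <= b_end and b_start <= a_end
               for a_start, a_end in first
               for b_start, b_end in second)
```

For a ROM split as $F000-$FDFF and $FF00-$FFFF this reports 3840 usable bytes, and a data area of $0200-$7FFF does not collide with it.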
The linker produces a binary file by default or when -bin is specified on the command line. You can generate a HEX or S19 file by specifying -hex or -s19 instead.
The librarian (Lb65) allows a collection of assembled modules to be combined into a single library file. The linker scans libraries during the link process and extracts only the modules needed to satisfy missing references. The command line syntax for the librarian is shown below:
Lb65 [-create|update|remove|list] library [object ...]
The librarian supports four actions, namely create, update, remove and list.
The example ctype library is built using the following command.
Lb65 -create ctype.lib is*.obj to*.obj
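The idea that the linker takes only the library modules it needs can be sketched as a loop over unresolved symbols. The Python below is a guess at the general technique, not a description of Lk65's actual code. The function name, the data layout and the module and symbol names in the usage note are all made up for illustration.

```python
def select_modules(undefined, library):
    """Pick the library modules needed to satisfy undefined symbols.

    library maps a module name to (symbols it defines, symbols it references).
    Returns (modules to link in order of selection, symbols still unresolved).
    """
    needed = []
    defined = set()
    unresolved = set(undefined)
    progress = True
    while unresolved and progress:
        progress = False
        for name, (defines, references) in library.items():
            if name in needed:
                continue
            if unresolved & set(defines):
                needed.append(name)
                defined |= set(defines)
                # A pulled-in module may itself need further symbols.
                unresolved |= set(references)
                unresolved -= defined
                progress = True
    return needed, unresolved
```

With a hypothetical library in which a toupper module references islower, asking for toupper alone pulls in both of those modules and nothing else.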
I've been meaning to write a relocating 6502 assembler for about 22 years but never seemed to have the time to get around to it. Before writing this package I tried out all kinds of demos and shareware but none of them was quite what I was looking for so I finally put the fingers to the keyboard.
All of the applications are 65XX specialised versions of an underlying more generic application. One day I might bolt a Z80/GameBoy Color or ARM/THUMB lexical analyser on the top. Send me an e-mail if you would like a copy of the sources.
The data format used to represent the object modules and libraries is XML. It might not be the most efficient way to represent object modules but in this day and age who cares about a few extra bytes and machine cycles. It's also so much easier to debug than a binary format.
Andrew Jacobs, January 2008